Abstract

We introduce a statistical mechanical formalism for the study
of discrete-time stochastic processes with which we prove: (i) General
properties of extremal chains, including triviality on the tail
$\sigma$-algebra, short-range correlations, realization via
infinite-volume limits and ergodicity. (ii) Two new sufficient
conditions for the uniqueness of the consistent chain. The first one is
a transcription of a criterion due to Georgii for one-dimensional Gibbs
measures, and the second one corresponds to the Dobrushin criterion in
statistical mechanics. (iii) Results on loss of memory and mixing
properties for chains in the Dobrushin regime. These results are
complementary to those existing in the literature, and generalize the
Markovian results based on the Dobrushin ergodic coefficient.

1 Introduction 

Chains with complete connections is the name coined by Onicescu and Mihoc
(1935) for discrete-time stochastic processes whose dependence on the past
is not necessarily Markovian. The theory of these processes has many points
in common with the theory of Gibbs measures in statistical mechanics,
in particular the existence of phase transitions. Nevertheless there is a
clear difference, at the formal level, between the two theories. Indeed,
processes are described in terms of single-site transition probabilities,
while Gibbs measures are characterized by their conditional probabilities
for arbitrary finite regions (specifications). In this paper we propose a
natural way to reduce this asymmetry, by introducing a
statistical-mechanical framework for the study of processes. This framework
establishes a more direct relation between the two theories, which allows
us to reproduce, for chains with complete connections, a number of
benchmark Gibbsian results.

We present three types of results. First, we obtain general properties
of extremal chains for any type of alphabet, namely triviality on the tail
$\sigma$-algebra, short-range correlations, realization via infinite-volume
limits and ergodicity. Second, we produce some new sufficient conditions
for the uniqueness of the consistent chain. On the one hand, we obtain a
transcription of a criterion given by Georgii (1974) for one-dimensional
Gibbs fields. This criterion is known to be optimal for the latter, in the
sense that it pinpoints the absence of phase transition for two-body spin
models with a $1/r^{2+\varepsilon}$-interaction, for all $\varepsilon > 0$.
The criterion imposes no restriction on the type of alphabet. On the other
hand we prove a "one-sided" Dobrushin criterion, which corresponds to a
well known uniqueness criterion in statistical mechanics (see, for
instance, Simon, 1993, Chapter V). This criterion is valid for systems with
a compact metric alphabet. We exhibit simple examples where the Dobrushin
criterion applies but that fall outside the scope of most other known
uniqueness criteria (Harris, 1955; Iosifescu and Spataru, 1973;
Walters, 1955; Berbee, 1987; Stenflo, 2002; Johansson and Oberg, 2002).

Our third type of results refers to loss of memory and mixing properties
of chains in the Dobrushin regime. Our results, obtained along the lines of
a similar Gibbsian theory (again we refer the reader to Chapter V of
Simon, 1993), are complementary, both in their precision and in their range
of applicability, to similar results available in the literature
(Iosifescu, 1992; Bressaud, Fernandez and Galves, 1999 and references
therein). The results depend on a sensitivity matrix that generalizes the
Dobrushin ergodic coefficient of Markov chains.

Our approach is based on a notion analogous to the specifications in
statistical mechanics, which we call left interval-specifications (LIS). These
are kernels for regions in the form of intervals which depend on the preceding
history of the process. In contrast, Gibbsian specifications involve arbitrary
finite regions and depend on the configuration on the whole exterior of the
region. This amounts, in one dimension, to a dependence on both past and
future. The difference is, of course, a consequence of the "one-sidedness"
associated with a stochastic (time) evolution, as compared with the lack of
favored direction in the spatial description provided by a Gibbs measure.

The description in terms of LIS is totally equivalent to the traditional 
description in terms of transition probabilities (=LIS singletons). We show 
this in our first theorem. But, as this paper illustrates, our approach has 
the advantage of allowing us to "import", in a natural manner, notions, 
techniques and arguments from statistical mechanics. It may also be use- 
ful in the opposite direction, namely to explore the consequences of known 
properties of chains for the theory of Gibbs measures. As a step in this 
direction, in a companion paper (Fernandez and Maillard, 2003) we study 
conditions under which chains and Gibbs measures can be identified. On a 
more conceptual level, we believe that our statistical mechanical approach is 
more appropriate to study the general situation where several different chains 
are consistent with the same transition probabilities (Bramson and Kalikow 
(1993), or Lacroix (2000)). Statistical mechanics is the framework developed, 
precisely, to study this phenomenon which corresponds to the appearance of 
(first-order) phase transitions. 

2 Preliminaries 

We consider a measurable space $(E, \mathcal{E})$ and a subset
$\Omega \subset E^{\mathbb{Z}}$. The exponent $\mathbb{Z}$ stands, in fact,
for any countable set with a total order. The group structure of
$\mathbb{Z}$ will play no role, except in Theorem 3.9 where $\mathbb{Z}$
acts by isomorphisms. The elements of $\mathbb{Z}$ are called sites, and
those of $\Omega$ (admissible) configurations. The space $E$ is sometimes
called alphabet. We endow $\Omega$ with the projection $\mathcal{F}$ of the
product $\sigma$-algebra associated to $E^{\mathbb{Z}}$. When we invoke
topological notions (e.g. compactness) the $\sigma$-algebra $\mathcal{E}$
is assumed to be Borelian. We adopt the following notation:

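• Let $\Lambda \subset \mathbb{Z}$. For a configuration
$\sigma \in E^{\mathbb{Z}}$ we denote
$\sigma_\Lambda = (\sigma_i)_{i\in\Lambda} \in E^\Lambda$. The set of
admissible configurations in $\Lambda$ is
$\Omega_\Lambda = \{\sigma_\Lambda \in E^\Lambda : \exists\, \omega \in \Omega \text{ with } \omega_\Lambda = \sigma_\Lambda\}$;
while $\mathcal{F}_\Lambda$ is the sub-$\sigma$-algebra of $\mathcal{F}$
generated by the cylinders with base in $\Omega_\Lambda$. If
$\Delta \subset \mathbb{Z}$ with $\Lambda \cap \Delta = \emptyset$,
$\omega_\Lambda \sigma_\Delta$ denotes the configuration on
$\Lambda \cup \Delta$ coinciding with $\omega_i$ for $i \in \Lambda$ and
with $\sigma_i$ for $i \in \Delta$.

• We denote $\mathcal{S}_b$ the set of finite intervals of $\mathbb{Z}$.
When $\Lambda = [k,n] \in \mathcal{S}_b$ we shall also use the "sequence"
notation: $\omega_k^n = \omega_{[k,n]} = \omega_k, \ldots, \omega_n$;
$\Omega_k^n = \Omega_{[k,n]}$, etc. If $\Lambda = [k, +\infty[$, the
notation will be analogous but with $+\infty$ as upper limit.

• If $n \in \mathbb{Z}$, $\mathcal{F}_{\le n} = \mathcal{F}_{]-\infty, n]}$.
For every $\Lambda \in \mathcal{S}_b$ we denote $l_\Lambda = \min \Lambda$;
$m_\Lambda = \max \Lambda$; $\Lambda_- = \,]-\infty, l_\Lambda - 1]$.

• For kernels associated to a LIS (defined below),
$\lim_{\Lambda \uparrow V} f_\Lambda$ is the limit of the net
$\{f_\Lambda, \{\Lambda\}_{\Lambda \in \mathcal{S}_b,\, \Lambda \subset V}, \subset\}$,
for $V$ an infinite interval of $\mathbb{Z}$. If $\mu$ is a measure on
$(\Omega, \mathcal{F})$ and $h$ an $\mathcal{F}$-measurable function, we
will write $\mu(h)$ instead of $E_\mu(h)$.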

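Definition 2.1 (LIS)

A left interval-specification $f$ on $(\Omega, \mathcal{F})$ is a family of
probability kernels $\{f_\Lambda\}_{\Lambda \in \mathcal{S}_b}$,
$f_\Lambda : \mathcal{F}_{\le m_\Lambda} \times \Omega \to [0,1]$, such
that for all $\Lambda$ in $\mathcal{S}_b$,

(a) For each $A \in \mathcal{F}_{\le m_\Lambda}$, $f_\Lambda(A \mid \cdot\,)$
is $\mathcal{F}_{\Lambda_-}$-measurable.

(b) For each $B \in \mathcal{F}_{\Lambda_-}$ and $\omega \in \Omega$,
$f_\Lambda(B \mid \omega) = 1_B(\omega)$.

(c) For each $\Delta \in \mathcal{S}_b$ : $\Delta \supset \Lambda$,

$$ f_\Delta f_\Lambda = f_\Delta \quad \text{on } \mathcal{F}_{\le m_\Lambda}, \tag{2.2} $$

that is, $(f_\Delta f_\Lambda)(h \mid \omega) = f_\Delta(h \mid \omega)$ for
each $\mathcal{F}_{\le m_\Lambda}$-measurable function $h$ and
configuration $\omega \in \Omega$.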

These conditions are analogous to those defining a specification in the 
theory of Gibbs measures (see Georgii, 1988, for instance). Two important 
differences should be highlighted, however, both being a consequence of the
"directional" character of the notion of process. First, the LIS kernels act
only on functions measurable towards the left, while Gibbsian
specifications have no similar constraint. As a consequence, LIS kernels involve only
conditioning with respect to the past [property (b)], while Gibbsian kernels 
condition with respect to the whole exterior of A. Second, LIS kernels are 
defined only for intervals while Gibbsian kernels are defined for all finite sets 
of sites. 

Property (c) is usually labeled consistency. There and in the sequel we
adopt the standard notation for a composition of probability kernels or of a
probability kernel with a measure. Explicitly, (2.2) means that

$$ \int\!\!\int h(\xi)\, f_\Lambda(d\xi \mid \sigma)\, f_\Delta(d\sigma \mid \omega) = \int h(\sigma)\, f_\Delta(d\sigma \mid \omega) $$

for each $\mathcal{F}_{\le m_\Lambda}$-measurable function $h$ and
configuration $\omega \in \Omega$.
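
The consistency identity (2.2) can be checked numerically on a toy example. The following is a minimal sketch, not the construction used in the paper: the two-letter alphabet, the range-2 transition probability `single_site` and the test function `h` are all hypothetical choices made for illustration. Interval kernels are built as products of single-site probabilities, and the code compares $f_\Delta(f_\Lambda h)$ with $f_\Delta(h)$ for $\Delta = [1,3]$ and $\Lambda = [2,3]$.

```python
from itertools import product

ALPHABET = (0, 1)

def single_site(symbol, past):
    """Hypothetical single-site transition probability.

    `past` is a tuple of earlier symbols, most recent last; only the
    last two symbols matter, so this toy chain is Markov of range 2.
    """
    p_one = 0.2 + 0.3 * past[-1] + 0.4 * past[-2]
    return p_one if symbol == 1 else 1.0 - p_one

def interval_kernel(h, length, past):
    """Average h over the next `length` sites, given `past`.

    The weight of a word is the product of single-site probabilities,
    each conditioned on everything to its left.
    """
    total = 0.0
    for word in product(ALPHABET, repeat=length):
        history, weight = tuple(past), 1.0
        for symbol in word:
            weight *= single_site(symbol, history)
            history += (symbol,)
        total += weight * h(history)
    return total

def consistency_gap(h):
    """Largest |f_Delta(f_Lambda h) - f_Delta(h)| over all pasts,
    with Delta = [1, 3], Lambda = [2, 3] and pasts on sites {-1, 0}."""
    worst = 0.0
    for past in product(ALPHABET, repeat=2):
        # f_Lambda h only looks at the configuration up to site 1,
        # i.e. at the first three entries of the 5-site tuple.
        inner = lambda cfg: interval_kernel(h, 2, cfg[:3])
        lhs = interval_kernel(inner, 3, past)
        rhs = interval_kernel(h, 3, past)
        worst = max(worst, abs(lhs - rhs))
    return worst

# h depends on sites -1..3, stored at tuple positions 0..4
h = lambda cfg: cfg[1] + cfg[2] + 2.0 * cfg[3] * cfg[4]
print(consistency_gap(h))
```

The printed gap should vanish up to floating-point rounding. Taking `h` constant equal to 1 in `interval_kernel` checks in the same way that the kernels are normalized.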
Definition 2.3 (left interval-consistency)

A probability measure $\mu$ on $(\Omega, \mathcal{F})$ is said to be
consistent with a LIS $f$ if for each $\Lambda \in \mathcal{S}_b$

$$ \mu f_\Lambda = \mu \quad \text{on } \mathcal{F}_{\le m_\Lambda}. \tag{2.4} $$

Such a measure $\mu$ is called a chain with complete connections, or simply
a chain, consistent with the LIS $f$. The family of these measures will be
denoted $\mathcal{G}(f)$.

Remarks 

2.5 A Markov LIS of range $k$ is a LIS such that each function
$f_\Lambda(A \mid \cdot\,)$ is measurable with respect to
$\mathcal{F}_{[l_\Lambda - k,\, l_\Lambda - 1]}$, for each
$A \in \mathcal{F}_\Lambda$. A chain consistent with such a LIS is a
Markov chain of range $k$.

2.6 Chains with complete connections is the original nomenclature intro- 
duced by Onicescu and Mihoc (1935) . These objects have been later 
reintroduced under a panoply of names, some associated to particular 
additional properties, others to notions later proven to be equivalent. 
Among them we mention: chains of infinite order (Harris, 1955), g- 
measures (Keane, 1972), list processes (Lalley, 1986), uniform martin- 
gales or random Markov processes (Kalikow, 1990). 

3 Results on general framework 

We start by making the connection with the traditional definition of chains 
based on singleton kernels. 

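Theorem 3.1 (Singleton consistency for chains)

Let $(f_i)_{i\in\mathbb{Z}}$ be a family of probability kernels
$f_i : \mathcal{F}_{\le i} \times \Omega \to [0,1]$ such that for each
$i \in \mathbb{Z}$

(a) For each $A \in \mathcal{F}_{\le i}$, $f_i(A \mid \cdot\,)$ is
$\mathcal{F}_{\le i-1}$-measurable.

(b) For each $B \in \mathcal{F}_{\le i-1}$ and $\omega \in \Omega$,
$f_i(B \mid \omega) = 1_B(\omega)$.

Then the LIS $f = \{f_\Lambda\}_{\Lambda \in \mathcal{S}_b}$ defined by

$$ f_\Lambda = f_{l_\Lambda}\, f_{l_\Lambda + 1} \cdots f_{m_\Lambda} \tag{3.2} $$

is the unique LIS such that $f_{\{i\}} = f_i$ for all $i \in \mathbb{Z}$.
Furthermore,

$$ \mathcal{G}(f) = \bigl\{ \mu \in \mathcal{P}(\Omega, \mathcal{F}) : \mu f_i = \mu \text{ for all } i \in \mathbb{Z} \bigr\}. \tag{3.3} $$

In particular, the theorem shows that any LIS $f$ enjoys the factorization
property

$$ f_\Lambda = f_{\{l_\Lambda\}}\, f_{\{l_\Lambda + 1\}} \cdots f_{\{m_\Lambda\}} \tag{3.4} $$

on $\mathcal{F}_{\le m_\Lambda}$ for each $\Lambda \in \mathcal{S}_b$. By
recurrence this yields

$$ f_{[l,m]} = f_{[l,n]}\, f_{[n+1,m]} \tag{3.5} $$

for any $l, n, m \in \mathbb{Z}$ with $l \le n < m$.

The following three theorems establish relations among extremality,
triviality, mixing properties and infinite-volume limits similar to those
valid for Gibbs measures or, more generally, for measures consistent with
specifications. Their proofs, presented in Section 6, are patterned on the
Gibbsian proofs, taking care of the one-sided measurability of the LIS
kernels.

Theorem 3.6 (Extremality and triviality)

Let $f = (f_\Lambda)_{\Lambda \in \mathcal{S}_b}$ be a left
interval-specification on $(\Omega, \mathcal{F})$. Denote by
$\mathcal{F}_{-\infty} = \bigcap_{k \in \mathbb{Z}} \mathcal{F}_{\le k}$ the
tail $\sigma$-algebra. Then

(a) $\mathcal{G}(f)$ is a convex set.

(b) A measure $\mu$ is extreme in $\mathcal{G}(f)$ if and only if $\mu$ is
trivial on $\mathcal{F}_{-\infty}$.

(c) Let $\mu \in \mathcal{G}(f)$ and $\nu \in \mathcal{P}(\Omega, \mathcal{F})$
such that $\nu \ll \mu$. Then $\nu \in \mathcal{G}(f)$ if and only if there
exists a $\mathcal{F}_{-\infty}$-measurable function $h \ge 0$ such that
$\nu = h\mu$.

(d) Each $\mu \in \mathcal{G}(f)$ is uniquely determined (within
$\mathcal{G}(f)$) by its restriction to the tail $\sigma$-algebra
$\mathcal{F}_{-\infty}$.

(e) Two distinct extreme elements $\mu, \nu$ of $\mathcal{G}(f)$ are
mutually singular on $\mathcal{F}_{-\infty}$.

Theorem 3.7 (Triviality and short-range correlations)

For each probability measure $\mu$ on $(\Omega, \mathcal{F})$, the following
statements are equivalent.

(a) $\mu$ is trivial on $\mathcal{F}_{-\infty}$.

(b) $\lim_{\Lambda \uparrow \mathbb{Z}} \sup_{B \in \mathcal{F}_{\Lambda_-}} \bigl| \mu(A \cap B) - \mu(A)\mu(B) \bigr| = 0$,
for all cylinder sets $A$ in $\mathcal{F}$.

(c) $\lim_{\Lambda \uparrow \mathbb{Z}} \sup_{B \in \mathcal{F}_{\Lambda_-}} \bigl| \mu(A \cap B) - \mu(A)\mu(B) \bigr| = 0$,
for all $A \in \mathcal{F}$.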
Theorem 3.8 (Infinite volume limits)

Let $f$ be a LIS, $\mu$ an extreme point of $\mathcal{G}(f)$ and
$(\Lambda_n)_{n \ge 1}$ a sequence of regions in $\mathcal{S}_b$ such that
$\Lambda_n \uparrow \mathbb{Z}$. Then

(a) $f_{\Lambda_n} h \to \mu(h)$ $\mu$-a.s. for each bounded local function
$h$ on $\Omega$.

(b) If $\Omega$ is a compact metric space, then for $\mu$-almost all
$\omega \in \Omega$, $f_{\Lambda_n}(h \mid \omega) \to \mu(h)$ for all
continuous local functions $h$ on $\Omega$.

The following theorem is the only result in the paper where we consider
translation invariance. We briefly recall the relevant notions. We consider
the (right) shift $\tau(i) = i + 1$. (More generally, the same theory
applies to any action of $\mathbb{Z}$ on $\mathbb{Z}$ by isomorphisms. In
the case of $k$-shifts such theory leads to $k$-periodic objects.) The shift
induces actions on configurations, measurable sets, measurable functions and
measures that we denote with the same symbol: for $\omega \in \Omega$,
$\tau(\omega) = (\omega_{i-1})_{i \in \mathbb{Z}}$; for $A \in \mathcal{F}$,
$\tau A = \{\tau\omega : \omega \in A\}$; for $h$ $\mathcal{F}$-measurable,
$(\tau h)(\omega) = h(\tau^{-1}\omega)$; and for a measure $\mu$ on
$(\Omega, \mathcal{F})$, $(\tau\mu)(h) = \mu(\tau^{-1} h)$. Objects
invariant under the action of the shift are called shift-invariant. We
denote $\mathcal{I}$ the $\sigma$-algebra of all shift-invariant measurable
sets, and $\mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ the set of
shift-invariant probability measures on $(\Omega, \mathcal{F})$. A measure
in $\mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ is ergodic if it is trivial
on $\mathcal{I}$.

For $k \in \mathbb{Z}$ and $\Lambda \subset \mathbb{Z}$ we denote
$\Lambda + k = \{i + k : i \in \Lambda\}$. A LIS $f$ is shift-invariant or
stationary if

$$ f_{\Lambda + 1}(\tau A \mid \tau\omega) = f_\Lambda(A \mid \omega) $$

for each $\Lambda \in \mathcal{S}_b$ and $\omega \in \Omega$. We denote
$\mathcal{G}_{\rm inv}(f)$ the family of shift-invariant chains consistent
with a LIS $f$.

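Theorem 3.9 (Ergodic chains)

Let $f$ be a shift-invariant LIS.

(a) A chain $\mu \in \mathcal{G}_{\rm inv}(f)$ is extreme in
$\mathcal{G}_{\rm inv}(f)$ if and only if $\mu$ is ergodic.

(b) Let $\mu \in \mathcal{G}_{\rm inv}(f)$. If
$\nu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ is such that
$\nu \ll \mu$, then $\nu \in \mathcal{G}_{\rm inv}(f)$.

(c) $\mathcal{G}_{\rm inv}(f)$ is a face of
$\mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$. More precisely, if
$\mu, \nu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ and $0 < s < 1$
are such that $s\mu + (1 - s)\nu \in \mathcal{G}_{\rm inv}(f)$, then
$\mu, \nu \in \mathcal{G}_{\rm inv}(f)$.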

4 Uniqueness results 

We shall prove two types of uniqueness results. We start with the coun- 
terpart of a criterion proven by Georgii (1974) for measures determined by 
specifications. 
Theorem 4.1 (One-sided boundary-uniformity)

Let $f$ be a LIS for which there exists a constant $c > 0$ satisfying the
following property: For every $m \in \mathbb{Z}$ and every cylinder set
$A \in \mathcal{F}_{\le m}$ there exists an integer $n < m$ such that

$$ f_{[n,m]}(A \mid \xi) \ge c\, f_{[n,m]}(A \mid \eta) \quad \text{for all } \xi, \eta \in \Omega. \tag{4.2} $$

Then there exists at most one chain consistent with $f$.

The main virtue of this criterion is its generality. Existing uniqueness 
criteria (Harris, 1955; Iosifescu and Spataru, 1973; Walters, 1955; Berbee, 
1987; Stenflo, 2002; Johansson and Oberg, 2002) require that the space E 
have particular properties (finite, countable, compact), and that the kernels 
satisfy appropriate non-nullness hypotheses. Many of these criteria are based 
on summability properties of the sequence of variations: 

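$$ \mathrm{var}_j(f_{\{i\}}) = \sup\Bigl\{ \bigl| f_{\{i\}}(\xi_i \mid \xi_{-\infty}^{i-1}) - f_{\{i\}}(\eta_i \mid \eta_{-\infty}^{i-1}) \bigr| : \xi, \eta \in \Omega_{-\infty}^{i},\ \xi_j^{i} = \eta_j^{i} \Bigr\} \tag{4.3} $$

for $j < i$.

Proposition 4.4

Assume that $E$ is a countable set and $\mathcal{E}$ the discrete
$\sigma$-algebra. A LIS $f$ satisfies the one-sided boundary-uniformity
condition (4.2) if it is uniformly non-null,

$$ \inf_{i \in \mathbb{Z}}\ \inf_{\omega \in \Omega_{-\infty}^{i}} f_{\{i\}}(\omega_i \mid \omega_{-\infty}^{i-1}) > 0, \tag{4.5} $$

and satisfies

$$ \sup_{i \in \mathbb{Z}} \sum_{j < i} \mathrm{var}_j(f_{\{i\}}) < +\infty. \tag{4.6} $$

We observe that when $f$ is stationary the last condition amounts to
summable variations: $\sum_{j<0} \mathrm{var}_j(f_{\{0\}}) < +\infty$.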

Our second type of uniqueness result corresponds to the Dobrushin
criterion for specifications. The required mathematical setting is richer.
We choose a bounded distance $d$ on $E$ and take $\mathcal{E}$ as the
associated Borel $\sigma$-algebra. We endow $E^{\mathbb{Z}}$ with the
product topology (so $\mathcal{F}$ is also Borel) and
$\Omega \subset E^{\mathbb{Z}}$ with the restricted topology. The choice of
distance is dictated by the type of measures to be analyzed. For finite, or
countable, alphabets the canonical choice is the discrete distance
$d_{\rm disc}(a, b) = 1$ if $a \ne b$ and $0$ otherwise.
Definition 4.7

A LIS $f$ on $(\Omega, \mathcal{F})$ is continuous if the functions
$\Omega \ni \omega \mapsto f_\Lambda(A \mid \omega)$ are continuous for all
$\Lambda \in \mathcal{S}_b$ and all $A \in \mathcal{F}_\Lambda$.

In the case of specifications, continuity is associated with Gibbsianness
(non-nullness is also needed, see, e.g., the discussion in Section 2.3.3 in van
Enter, Fernandez and Sokal, 1993). For $E$ finite, continuity is equivalent to
$\lim_{j \to -\infty} \mathrm{var}_j(f_{\{i\}}) = 0$.

Remark 4.8

If the LIS $f$ is continuous and the space $\Omega$ is compact, then there
always exists at least one compatible chain. Indeed, the probability
measures on a compact space form a (weakly) compact set. Hence, if
$(\Lambda_n)_{n \in \mathbb{N}} \subset \mathcal{S}_b$ is any exhausting
sequence of regions and $(\omega^{(n)})_{n \in \mathbb{N}} \subset \Omega$
any sequence of pasts, the sequence of measures
$f_{\Lambda_n}(\,\cdot \mid \omega^{(n)})$, $n \in \mathbb{N}$, has some
accumulation point. Continuity ensures that such a limit belongs to
$\mathcal{G}(f)$. Therefore, for continuous LIS on a compact space of
configurations, the following theorems determine conditions for the
existence of exactly one compatible measure.

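For every $i \in \mathbb{Z}$ and every $\mathcal{F}_{\le i}$-measurable
function $h$, the $d$-oscillation of $h$ with respect to the site
$j \le i$ is defined by

$$ \delta_j^d(h) = \sup\Bigl\{ \frac{|h(\xi) - h(\eta)|}{d(\xi_j, \eta_j)} : \xi, \eta \in \Omega_{-\infty}^{i},\ \xi \overset{\ne j}{=} \eta \Bigr\} \tag{4.9} $$

with the convention $0/0 = 0$ and where we introduced the notation

$$ \xi \overset{\ne j}{=} \eta \iff \xi_i = \eta_i,\ \forall\, i \ne j \tag{4.10} $$

("$\xi$ equal to $\eta$ off $j$"). We introduce also the space of functions
of bounded $d$-oscillations:

$$ B_d = \Bigl\{ \mathcal{F}\text{-measurable } h : \sup_{j} \delta_j^d(h) < \infty \Bigr\}, \tag{4.11} $$

and its restrictions

$$ B_d(\Lambda) = \bigl\{ h \in B_d : h\ \mathcal{F}_\Lambda\text{-measurable} \bigr\} $$

for $\Lambda \subset \mathbb{Z}$. The most general version of Dobrushin's
strategy allows the use of a "pavement" of $\mathbb{Z}$ by finite intervals.
These intervals $V$ must be chosen so that there is an appropriate control
of the "sensitivity" of the averages $f_V$ to the configuration in $V_-$.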
Definition 4.12 ($d$-sensitivity estimator)

Let $V \in \mathcal{S}_b$ and $f_V$ a probability kernel on
$\mathcal{F}_{\le m_V} \times \Omega$. A $d$-sensitivity estimator for $f_V$
is a nonnegative matrix $\alpha^V = (\alpha^V_{ij})_{i,j \in \mathbb{Z}}$
such that $\alpha^V_{ij} = 0$ if $i \notin V$ or $j \notin V_-$ and

$$ \delta_j^d(f_V h) \le \sum_{i \in V} \alpha^V_{ij}\, \delta_i^d(h) \tag{4.13} $$

for all $j \in V_-$ and $\mathcal{F}_V$-measurable functions $h \in B_d$.

Theorem 4.14 (One-sided Dobrushin)

Let $f$ be a continuous LIS. If there exists a countable partition
$\mathcal{V}$ of $\mathbb{Z}$ into finite intervals such that for each
$V \in \mathcal{V}$ there exists a $d$-sensitivity estimator $\alpha^V$ for
$f_V$ with

$$ \sum_{j \in V_-} \alpha^V_{ij} < 1 \tag{4.15} $$

for all $i \in \mathbb{Z}$, then there exists at most one chain consistent
with $f$.

In particular, the partition can be trivial, namely
$\mathcal{V} = \{\{i\} : i \in \mathbb{Z}\}$. In the stationary case, the
estimators for such a partition are of the form
$\alpha^{\{i\}}_{ij} = \alpha(i - j)$ for a certain function $\alpha$ on the
integers that takes value zero for non-positive integers. The Dobrushin
criterion becomes, then, $\sum_{n \ge 1} \alpha(n) < 1$.

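The customary way to construct $d$-sensitivity estimators for kernels $f_V$
is resorting to the Vaserstein-Kantorovich-Rubinstein (VKR) distance between
measures on $\mathcal{F}_V$ for the distance
$d_V(\sigma_V, \omega_V) = \sum_{i \in V} d(\sigma_i, \omega_i)$. If we
denote $\overset{\circ}{f}_V$ the projection of each kernel $f_V$ over
$\Omega_V$:

$$ \overset{\circ}{f}_V(A \mid \omega_{-\infty}^{l_V - 1}) \triangleq f_V(\{\sigma_V \in A\} \mid \omega_{-\infty}^{l_V - 1}), \quad \forall A \in \mathcal{F}_V,\ \forall \omega_{-\infty}^{l_V - 1} \in \Omega_{-\infty}^{l_V - 1}, $$

then the VKR distances between these projections are

$$ \bigl\| \overset{\circ}{f}_V(\,\cdot \mid \xi_{-\infty}^{l_V - 1}) - \overset{\circ}{f}_V(\,\cdot \mid \eta_{-\infty}^{l_V - 1}) \bigr\|_d = \sup\Bigl\{ \bigl| \overset{\circ}{f}_V(h \mid \xi_{-\infty}^{l_V - 1}) - \overset{\circ}{f}_V(h \mid \eta_{-\infty}^{l_V - 1}) \bigr| : h \in B_d(V),\ \mathrm{osc}_V(h) \le 1 \Bigr\} \tag{4.16} $$

where
$\mathrm{osc}_V(h) = \sup\{ |h(\sigma_V) - h(\omega_V)| / d_V(\sigma_V, \omega_V) \}$.
Equivalently (see, for instance, Dudley, 2002, Section 11.8),

$$ \bigl\| \overset{\circ}{f}_V(\,\cdot \mid \xi_{-\infty}^{l_V - 1}) - \overset{\circ}{f}_V(\,\cdot \mid \eta_{-\infty}^{l_V - 1}) \bigr\|_d = \inf\Bigl\{ \int d_V(\sigma_V, \omega_V)\, \rho(d\sigma_V, d\omega_V) : \rho \in \mathcal{P}(\Omega_V \times \Omega_V) \text{ with marginals } \overset{\circ}{f}_V(\,\cdot \mid \xi_{-\infty}^{l_V - 1}) \text{ and } \overset{\circ}{f}_V(\,\cdot \mid \eta_{-\infty}^{l_V - 1}) \Bigr\}. \tag{4.17} $$

The VKR (canonical) $d$-estimator is defined by the coefficients

$$ C^V_{ij}(f) = \sup_{\xi \overset{\ne j}{=} \eta} \frac{\bigl\| \overset{\circ}{f}_V(\,\cdot \mid \xi_{-\infty}^{l_V - 1}) - \overset{\circ}{f}_V(\,\cdot \mid \eta_{-\infty}^{l_V - 1}) \bigr\|_d}{d(\xi_j, \eta_j)}, \quad i \in V,\ j \in V_-, \tag{4.18} $$

and $C^V_{ij}(f) = 0$ otherwise.

If the partition is trivial and $d$ is the discrete metric, each such VKR
distance coincides with the variational norm. If the alphabet $E$ is
countable, this means

(4.19)

and a sufficient condition for the Dobrushin criterion (4.15) is, therefore,

$$ \sum_{j < i} C_{ij}(f) < 1, \quad i \in \mathbb{Z}. \tag{4.20} $$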

Besides the absence of non-nullness hypotheses, an advantage of the
Dobrushin criterion is that it determines a regime where mixing properties
can be determined, as we discuss in the next section.

To conclude, we remark that in fact the two uniqueness criteria given in
Theorems 4.1 and 4.14 give a very strong form of uniqueness.

Definition 4.21 (HUC)

A LIS $f$ on $(\Omega, \mathcal{F})$ satisfies a hereditary uniqueness
condition (HUC) if for all intervals of the form $\Gamma = [k, +\infty[$,
$k \in \mathbb{Z}$, and configurations $\omega \in \Omega$, the LIS
$f^{(\Gamma, \omega)}$ defined by

$$ f^{(\Gamma,\omega)}_\Lambda(\,\cdot \mid \xi) = f_\Lambda(\,\cdot \mid \omega_{\Gamma_-}\, \xi_\Gamma), \quad \Lambda \in \mathcal{S}_b,\ \Lambda \subset \Gamma \tag{4.22} $$

admits at most one consistent chain.

The two criteria given above involve bounds valid for all past conditions.
They remain, therefore, valid if only particular pasts are considered as in
(4.22). This observation proves the following corollary.

Corollary 4.23 

If a LIS satisfies the hypotheses of either Theorem 4.1 or Theorem 4.14 then
it also satisfies a HUC.

We remark that, for similar reasons, the criteria of Harris (1955), Stenflo 
(2002) and Johansson and Oberg (2002) also imply the validity of a HUC. 

5 Results on loss of memory and mixing prop- 
erties 

We place ourselves in the framework needed for the one-sided Dobrushin
criterion ($E$ with a topology defined by a bounded metric $d$,
$\mathcal{E}$ its Borel $\sigma$-algebra, $\Omega$ topologized with the
restricted product topology) and take up all the related notions:
$d$-oscillations, functions of bounded oscillations, sensitivity estimators.
To improve readability, we write the results only for a trivial partition
$\mathcal{V}$. Versions for more general partitions, of potential interest
for coarse-graining arguments, can be obtained in a straightforward manner
from our proofs by replacing sites by blocks of sites.

Definition 5.1

A $d$-sensitivity matrix for a LIS $f$ is a matrix of the form

$$ \alpha = \sum_{i \in \mathbb{Z}} \alpha^{\{i\}} \tag{5.2} $$

where each $\alpha^{\{i\}}$ is a $d$-sensitivity estimator for $f_{\{i\}}$,
$i \in \mathbb{Z}$.

Theorem 5.3 (Loss of memory)

Let $f$ be a continuous LIS and $(\alpha_{ij})$ a $d$-sensitivity matrix for
$f$. Then,

(i) For every $\Lambda \in \mathcal{S}_b$, $j < l_\Lambda$ and
$h \in B_d(\Lambda)$,

(5.4)

(ii) Assume that there exists a function
$F : \mathbb{Z}^2 \to \mathbb{R}_+$ satisfying the triangular inequality
$F(i,j) \le F(i,k) + F(k,j)$ $\forall\, i, j, k \in \mathbb{Z}$ such that

$$ \gamma_i \triangleq \sum_{j < i} \alpha_{ij}\, e^{F(i,j)} < 1, \tag{5.5} $$

for each $i \in \mathbb{Z}$. Then, for each $\Lambda \in \mathcal{S}_b$,
$h \in B_d(\Lambda)$ and $j < l_\Lambda$,

(5.6)

with $\gamma_\Lambda = \max_{i \in \Lambda} \gamma_i$.
Remarks 

5.7 In the Markovian case $\alpha_{ij} = 0$ if $|i - j| > 1$. Then
expression (5.4) implies that for $h \in \mathcal{F}_{\{n\}}$

^i(/[o,n](/0) < 7" TO (5-8) 

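with $\gamma = \sup_i \alpha_{i,i-1}$. For $d$ discrete and estimators
(4.18), $\gamma$ is known as the Dobrushin ergodic coefficient. If, in
addition, $E$ is countable, $\Omega = E^{\mathbb{Z}}$ and $f$
shift-invariant, then

$$ \gamma = 1 - \min_{\xi_{-1}, \eta_{-1}} \sum_{\omega_0} f_{\{0\}}(\omega_0 \mid \xi_{-1}) \wedge f_{\{0\}}(\omega_0 \mid \eta_{-1}). \tag{5.9} $$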
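
As an illustration of (5.9), the following minimal sketch computes the Dobrushin ergodic coefficient of a finite-state Markov chain from its transition matrix. The matrix `P` and the function names are hypothetical, chosen only for this example. For the discrete metric the coefficient is the largest total variation distance between two rows, which the second function computes directly as a cross-check.

```python
def dobrushin_coefficient(P):
    """gamma = 1 - min over pairs of rows (x, y) of sum_z min(P[x][z], P[y][z])."""
    overlaps = [
        sum(min(px, py) for px, py in zip(row_x, row_y))
        for row_x in P
        for row_y in P
    ]
    return 1.0 - min(overlaps)

def max_total_variation(P):
    """Largest total variation distance between two rows of P."""
    return max(
        0.5 * sum(abs(px - py) for px, py in zip(row_x, row_y))
        for row_x in P
        for row_y in P
    )

# Hypothetical two-state transition matrix; each row sums to 1.
P = [[0.9, 0.1],
     [0.2, 0.8]]

gamma = dobrushin_coefficient(P)
print(gamma, max_total_variation(P))
```

For this matrix the overlap between the two rows is 0.2 + 0.1, so both functions return 0.7 up to rounding. A value strictly below 1 is what the Markovian contraction arguments require.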

5.10 If the alphabet $E$ is countable and the metric discrete we can use the
estimators (4.19). With this choice, (5.6) implies

$$ \delta_j(f_{\{i\}}) \le \mathrm{const}\; e^{-F(i,j)} \ \Longrightarrow\ \delta_{-n}\bigl[f_{[0,m]}(A)\bigr] \le \mathrm{const}\; e^{-F(m,-n)}, \quad A \in \mathcal{F}_{\{m\}}. \tag{5.11} $$

Published loss-of-memory results (Iosifescu, 1992; Bressaud, Fernandez
and Galves, 1999) resort instead to the variations (4.3). Comparisons can
only be made through the obvious inequalities

$$ \delta_j\bigl[f_{\{i\}}(h)\bigr] \le \mathrm{var}_j\bigl[f_{\{i\}}(h)\bigr] \le \sum_{k < j} \delta_k\bigl[f_{\{i\}}(h)\bigr]. $$

For LIS with an exponentially decaying dependence on the past, (5.11)
implies an exponential loss of memory with an identical rate, in terms
either of oscillations or of variations. This should be contrasted with the
results in Bressaud, Fernandez and Galves (1999) where there is an
infinitesimal loss of rate. LIS with a power-law dependence can be treated
by taking $F(i,j) = c \log(1 + |i - j|)$. In terms of variations, the loss
of memory implied by (5.11) is also a power law but with a power decreased
by one unit. Bressaud, Fernandez and Galves (1999) obtain, instead, the
same power.

Furthermore, it is relatively simple to construct examples falling outside
the scope of all preexisting loss-of-memory results, but for which Theorem
5.3 applies. Consider, for instance, the 2-letter alphabet $E = \{0, 1\}$
and a shift-invariant LIS defined by singletons

$$ f(\omega_0 = 1 \mid \omega_{-\infty}^{-1}) = \sum_{i < 0} a_i\, \omega_i, \tag{5.12} $$

for a sequence $\{a_j\}_{j<0}$ of non-negative numbers. The estimators
(4.19) yield a sensitivity matrix

$$ \alpha_{ij} = \delta_j(f_{\{i\}}) = a_{j-i} \tag{5.13} $$

for $i > j$, and zero otherwise. Theorem 5.3 is therefore applicable as
long as $\sum_{i<0} a_i < 1$. On the other hand, for each
$0 < \varepsilon < 1$, the choice

1 -e 1 / \ 

°- = (5 - 14) 

with M e = J2k>i k~ {1+£ \ satisfies 

vaxnifux) > — 

3\JW - (i-j- iy 

for $i - j \ge 2$. Thus, this LIS is not covered by the results of Iosifescu
(1992) or of Bressaud, Fernandez and Galves (1999). It also does not satisfy
any uniqueness criteria except the one-sided Dobrushin criterion.
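
The sensitivity coefficients of the linear example (5.12) can be checked by brute force on a finite-memory truncation. The sketch below rests on two assumptions that are not taken from the paper: the memory is cut to three sites with hypothetical weights, and the sensitivity to a past site is measured by the total variation distance between the two conditional laws (the discrete-metric reading of the estimators). Under these assumptions the coefficient for the site $k$ steps back equals the weight attached to that site.

```python
from itertools import product

def prob_one(past, coeffs):
    """P(next symbol = 1 | past) for the linear kernel of the example.

    past[-k] is the symbol k steps back and coeffs[k - 1] its weight.
    """
    return sum(a * past[-k] for k, a in enumerate(coeffs, start=1))

def tv_distance(p, q):
    """Total variation distance between Bernoulli(p) and Bernoulli(q)."""
    return 0.5 * (abs(p - q) + abs((1.0 - p) - (1.0 - q)))

def sensitivity(coeffs, k):
    """Largest TV distance between the laws obtained from two pasts
    that differ only at the site k steps back."""
    worst = 0.0
    for past in product((0, 1), repeat=len(coeffs)):
        flipped = list(past)
        flipped[-k] = 1 - flipped[-k]
        gap = tv_distance(prob_one(past, coeffs), prob_one(flipped, coeffs))
        worst = max(worst, gap)
    return worst

# Hypothetical weights for the sites 1, 2 and 3 steps back.
coeffs = [0.3, 0.2, 0.1]
estimates = [sensitivity(coeffs, k) for k in (1, 2, 3)]
print(estimates, sum(estimates) < 1.0)
```

Here the row sum of the resulting sensitivity matrix is 0.6, so the one-sided Dobrushin condition holds for this truncated kernel.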

The following mixing results form the LIS version of a well known chapter
in the theory of Gibbs measures (see, for example, Chapter V in Simon,
1993). Their proofs, presented in a later section, follow the guidelines of
the statistical mechanical proofs. They require a compact $\Omega$. We
observe that example (5.12)-(5.14) shows that our results are complementary
to those existing in the literature, which are based on variations rather
than oscillations (Bressaud, Fernandez and Galves, 1999, and references
therein).

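Theorem 5.15

Assume $\Omega$ compact and let $f$ and $\tilde f$ be two LIS on
$(\Omega, \mathcal{F})$ with $f$ continuous and with a unique consistent
measure. Assume also that for each $i \in \mathbb{Z}$ there exists a
measurable function $b_i$ on $\Omega$ such that

$$ \bigl\| \overset{\circ}{f}_{\{i\}}(\,\cdot \mid \omega) - \overset{\circ}{\tilde f}_{\{i\}}(\,\cdot \mid \omega) \bigr\|_d \le b_i(\omega) \tag{5.16} $$

for every configuration $\omega \in \Omega_{-\infty}^{i-1}$. Then, for all
$\mu \in \mathcal{G}(f)$, $\tilde\mu \in \mathcal{G}(\tilde f)$ and
$\Lambda \in \mathcal{S}_b$

$$ \bigl| \mu(h) - \tilde\mu(h) \bigr| \le \sum_{k \in \Lambda \cup \Lambda_-} \tilde\mu(b_k)\, \delta_k^d\bigl(f_{[k+1, m_\Lambda]}\, h\bigr) \tag{5.17} $$

for every $h \in B_d(\Lambda)$.

Let us denote $D = \sup_{x,y \in E} d(x, y)$ and, for a measure $\mu$ on
$\mathcal{F}$ and $\mathcal{F}$-measurable functions $h_1$ and $h_2$,

$$ \mathrm{Cor}_\mu(h_1, h_2) = \mu(h_1 h_2) - \mu(h_1)\, \mu(h_2). $$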



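Theorem 5.18

Assume $\Omega$ compact and let $f$ be a LIS on $(\Omega, \mathcal{F})$ that
is continuous and with a unique consistent measure. Let $\mu$ be the unique
probability measure in $\mathcal{G}(f)$. Then for every
$\Lambda, \Delta \in \mathcal{S}_b$ such that $m_\Delta < l_\Lambda$,

$$ \mathrm{Cor}_\mu(h_1, h_2) \le \frac{D^2}{4} \sum_{k \le m_\Delta} \delta_k^d\bigl(f_{[k+1, m_\Lambda]}\, h_1\bigr)\, \delta_k^d\bigl(f_{[k+1, m_\Delta]}\, h_2\bigr) \tag{5.19} $$

for all functions $h_1 \in B_d(\Lambda)$ and
$h_2 \in B_d(]-\infty, m_\Delta])$.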

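The next corollary offers a more quantitative consequence of this theorem.
For all $\Lambda \in \mathcal{S}_b$ we define the $\Lambda$-projection

$$ (P_\Lambda)_{kj} = \begin{cases} 1 & \text{if } k = j \text{ and } k \in \Lambda \\ 0 & \text{otherwise.} \end{cases} $$

For a matrix $(A_{kj})_{k,j \in \mathbb{Z}}$ with nonnegative entries, we
denote

$$ \Bigl( \frac{A}{1 - A} \Bigr)_{kj} = \sum_{n \ge 1} (A^n)_{kj}. \tag{5.20} $$

These are well-defined sums on $[0, +\infty]$.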
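
For a finite matrix whose powers decay, the series in (5.20) has the familiar closed form $A(1-A)^{-1}$. The following minimal sketch checks this on a hypothetical $2 \times 2$ nonnegative matrix with row sums below 1; the helper names and the matrix are illustrative only, and the infinite matrices of the paper are not covered by this finite check.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_sum(A, terms=200):
    """Partial sum A + A^2 + ... + A^terms."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in A]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
        power = mat_mul(power, A)
    return total

def closed_form_2x2(A):
    """A (1 - A)^{-1} for a 2x2 matrix, using the explicit inverse."""
    a, b = 1.0 - A[0][0], -A[0][1]
    c, d = -A[1][0], 1.0 - A[1][1]
    det = a * d - b * c
    inverse = [[d / det, -b / det], [-c / det, a / det]]
    return mat_mul(A, inverse)

# Hypothetical nonnegative matrix with row sums 0.3 and 0.7.
A = [[0.2, 0.1],
     [0.3, 0.4]]

series = neumann_sum(A)
closed = closed_form_2x2(A)
print(series, closed)
```

The two results agree to within rounding because the terms of the series decay geometrically here. When the powers do not decay, entries of the sum may be $+\infty$, which is why the sums are taken in $[0, +\infty]$.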



Corollary 5.21

Consider the hypotheses of the previous theorem and let $(\alpha_{ij})$ be a
$d$-sensitivity matrix for $f$.

(i) If $h_1 \in B_d(\Lambda)$ and $h_2 \in B_d(]-\infty, m_\Delta])$,



JJ 2 



-PaQ! 
1-P A « 



(5.22) 
(ii) If $h_1 \in B_d(\Lambda)$ and $h_2 \in B_d(\Delta)$,



D 



Cor, (hx, h 2 ) < — J2 E Stfa) A ml , (5.23) 



leA meA 



where 



A 



ml 



P A a 



1 - P A a 



ml k<m A 



P A a 



l-P A a 



mk 



P[k+l,m A ] OC 
1 — P[k+l,m A ] Oi 



Ik 



The following proposition is useful to estimate the different matrices ap- 
pearing in this corollary. 

Proposition 5.24 

If $(\alpha_{ij})$ is a matrix satisfying (5.5), then for each $\Lambda \in \mathcal{S}_b$



-Pa « 
I -Pa a 



< 



^ A e -F(k,j) 



1-1 



1 - 7a 



(5.25) 



6 Proofs for the general framework 
6.1 Singleton consistency for chains 

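The fact that the objects defined by (3.2) are kernels from
$\mathcal{F}_{\le m_\Lambda} \times \Omega$ to the interval $[0,1]$ follows
immediately from the properties of the kernels $f_i$. Their normalization is
proven by induction, using the fact that

$$ f_{\{i\}}(\Omega \mid \cdot\,) = f_{\{i\}}(\Omega_{\le i} \mid \cdot\,) = 1 $$

and the inductive step

$$ f_\Lambda(\Omega_{\le m_\Lambda} \mid \omega) = f_{[l_\Lambda, m_\Lambda - 1]}\bigl( f_{m_\Lambda}(\Omega_{\le m_\Lambda}) \bigm| \omega \bigr) = f_{[l_\Lambda, m_\Lambda - 1]}(1 \mid \omega) = 1, $$

for $\omega \in \Omega_{< l_\Lambda}$.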

Properties (a) and (b) of Definition 2.1 of LIS are an immediate
consequence of similar properties of the kernels $f_i$. To prove
consistency, we first remark that for $l \le m \le p$, $\omega \in \Omega$
and any $\mathcal{F}_{\le p}$-measurable function $h$,

$$ \bigl(f_{[l,m]}\, f_{[l,p]}\bigr)(h \mid \omega) = f_{[l,m]}\bigl( f_{[l,p]}(h) \bigm| \omega \bigr) = f_{[l,p]}(h \mid \omega)\, f_{[l,m]}(1 \mid \omega) = f_{[l,p]}(h \mid \omega). \tag{6.1} $$

The second equality is due to the proven property (b) of Definition 2.1 plus
the fact that $f_{[l,p]}(h \mid \cdot\,)$ is
$\mathcal{F}_{\le l-1}$-measurable. The last equality is the just proven
normalization. Identity (6.1) justifies the last equality in the following
string of identities, valid for $l \le m < p$,

$$ f_{[l,p]}\, f_{[l,m]} = f_{[l,m]}\, f_{[m+1,p]}\, f_{[l,m]} = f_{[l,m]}\, f_{[l,p]} = f_{[l,p]}. \tag{6.2} $$

The other equalities are simply due to definition (3.2). A similar identity
is trivially true for $l \le m = p$. Consistency follows, for if
$\Delta \supset \Lambda$:

$$ f_\Delta f_\Lambda = f_{[l_\Delta, l_\Lambda - 1]}\, f_{[l_\Lambda, m_\Delta]}\, f_{[l_\Lambda, m_\Lambda]} = f_{[l_\Delta, l_\Lambda - 1]}\, f_{[l_\Lambda, m_\Delta]} = f_\Delta. $$

We used (6.2) in the middle identity and we assumed $l_\Delta < l_\Lambda$;
otherwise we revert to (6.2).

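The remainder of the proof relies on the following observation valid for
any measure $\mu$ on $\mathcal{F}$ and any $\Lambda \in \mathcal{S}_b$:

$$ \mu f_i = \mu,\ \forall\, i \in \Lambda \ \Longrightarrow\ \mu f_\Lambda = \mu. \tag{6.3} $$

This is proven by induction on the cardinality of $\Lambda$ through the
identity

$$ \mu f_\Lambda = \mu f_{l_\Lambda}\, f_{[l_\Lambda + 1, m_\Lambda]} = \mu f_{[l_\Lambda + 1, m_\Lambda]}. $$

Property (6.3) directly proves the non-trivial inclusion in (3.3).
Furthermore, it yields uniqueness. Indeed, consider a LIS
$(g_\Lambda)_{\Lambda \in \mathcal{S}_b}$ consistent with the family
$(f_i)_{i \in \mathbb{Z}}$. By (6.3) $g_\Lambda$ must be consistent with
$f_\Lambda$ for each $\Lambda \in \mathcal{S}_b$. But then, if
$\omega \in \Omega$ and $h$ is $\mathcal{F}_{\le m_\Lambda}$-measurable,

$$ g_\Lambda(h \mid \omega) = g_\Lambda\bigl( f_\Lambda(h) \bigm| \omega \bigr) = f_\Lambda(h \mid \omega)\, g_\Lambda(1 \mid \omega) = f_\Lambda(h \mid \omega). $$

The second identity is a consequence of the
$\mathcal{F}_{\le l_\Lambda - 1}$-measurability of
$f_\Lambda(h \mid \cdot\,)$ plus property (b) of Definition 2.1. The last
equality is the normalization of $g_\Lambda$. □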



6.2 Extreme chains 

We start with general results on probability kernels. 
Proposition 6.4 

Let B be a sub-a-algebra of T , tt a probability kernel on B x Q and p G 
such that pn = p on B. Then: 
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(i) The system 

T*(H) ± [A G B : n(A I • ) = 1 A ( ■ ) /i-a.s.} (6.5) 

is a a-algebra. 
(ii) For all £>- measurable functions h : ft — > [0, +oo[ , 

(/i/i) n = hfi on B if and only if h is Zf (/i) -measurable . (6.6) 

Proof 

(i) Clearly ft G 2^(/i). For each A G T*(/x), 

tt(A c I • ) = 1 - ti(A | • ) = 1 - 1a (/i-a.s.) = II a- (/i-a.s.) . 
Likewise, for each sequence (A n ) neN of disjoint sets in (/i), 

tt(uA„ I •) = ^7r(A n | •) = (/i-a.s.) = t uAn (/i-a.s.) . 

neN neN 

Finally, if A, 5 G T*(/x), then 

n(Ar\B\-) < tt(A I •) A-k(B \ •) = 1 a /\^b (/i-a.s.) = 1 ACi b (/i-a.s.) 
and, by the consistency of /i with 7r, 

^Iahb -7r(AnS | •)) = M^ 4 nS)- /i7r(A nB) = 0. 

Thus 

n(A D B) = l Ar ,B /i-a.s. 

(ii) Let us assume that (h/i) n = h/j, on B. To prove necessity it suffices to 
show that {h > c} G Zf (/i), for all c > 0. Let us fix some c > and denote 
g = t h > c . We have 

-9)hTr(g)^ = (hfi)(ir(g)) - n(g hir(g)) = (h/i)(g) - /i(g hn(g)) 

= »(gh(l-ir(g))) . 

But g/i > eg and 1 — 7r(<?) > 0, hence 

/i((l - g)hTr(g^) > c/i (g(l - tt(#))) = c/i(ir(g)) - c/i(gir(g)) 

= cn({l -g)ir(g)j . 
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We obtain that \i (t{h<c} (h — c) ir(g)) > 0, which implies U{^ <c i7r(^) = \i- 
a.s. Therefore, 

n(9) = gn^g) + ^{h<c}K(g) < g /i-a.s. 

Furthermore, fi(g — ir(g)) = by the consistency of /a with n. This fact, 
together with the previous inequality, allow us to conclude that ir(g) = g 
/z-a.s., that is {h > c] £ I^(fJt). 

Conversely, assume that $h$ is $\mathcal{I}_\pi(\mu)$-measurable. By the standard machinery of measure theory, sufficiency follows if we show for all $A \in \mathcal{I}_\pi(\mu)$ that $(\mathbb{1}_A \mu)\pi = \mathbb{1}_A \mu$ on $\mathcal{B}$. If $B \in \mathcal{B}$,

$$(\mathbb{1}_A\mu)\pi(B) = (\mathbb{1}_A\mu)\pi(A \cap B) + (\mathbb{1}_A\mu)\pi(B \setminus A) \le \mu\pi(A \cap B) + (\mathbb{1}_A\mu)\pi(A^c).$$

The consistency of $\mu$ with $\pi$ implies that the second term of the last line is zero. Thus we have proved that

$$(\mathbb{1}_A\mu)\pi(B) \le (\mathbb{1}_A\mu)(B). \tag{6.7}$$

By the same token,

$$(\mathbb{1}_A\mu)\pi(B^c) \le (\mathbb{1}_A\mu)(B^c). \tag{6.8}$$

But the consistency of $\mu$ with $\pi$ implies that the sum of the LHS of (6.7) and (6.8) equals the sum of the corresponding RHS, namely $\mu(A)$. We conclude that $(\mathbb{1}_A\mu)\pi(B) = (\mathbb{1}_A\mu)(B)$. □
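Part (ii) of the proposition can be checked by hand on a finite state space. The following is a minimal sketch, not taken from the paper: the 4-state stochastic matrix `P`, the measure `mu` and the densities `h_good`, `h_bad` are hypothetical choices, and $\mathcal{B}$ is taken to be the full power set. It enumerates the invariant sets and confirms that a density $h$ satisfies $(h\mu)\pi = h\mu$ exactly when its level sets are invariant.

```python
from fractions import Fraction as Fr
from itertools import combinations

# Hypothetical kernel with two closed classes {0, 1} and {2, 3}.
P = [
    [Fr(1, 2), Fr(1, 2), Fr(0), Fr(0)],
    [Fr(1, 4), Fr(3, 4), Fr(0), Fr(0)],
    [Fr(0), Fr(0), Fr(1, 3), Fr(2, 3)],
    [Fr(0), Fr(0), Fr(2, 3), Fr(1, 3)],
]
# A measure consistent with P (mu P = mu): equal mixture of the two
# class-wise stationary laws (1/3, 2/3) and (1/2, 1/2).
mu = [Fr(1, 6), Fr(1, 3), Fr(1, 4), Fr(1, 4)]
STATES = range(len(P))


def push(m, kernel):
    """Return the measure m*kernel: (m kernel)(y) = sum_x m(x) kernel(x, y)."""
    return [sum(m[x] * kernel[x][y] for x in STATES) for y in STATES]


def invariant_sets(m, kernel):
    """All sets A with kernel(x, A) = 1_A(x) for every state x charged by m."""
    found = []
    for r in range(len(kernel) + 1):
        for A in combinations(STATES, r):
            if all(sum(kernel[x][y] for y in A) == (1 if x in A else 0)
                   for x in STATES if m[x] > 0):
                found.append(frozenset(A))
    return found


def is_measurable(h, sets):
    """On a finite space, h is measurable w.r.t. the sigma-algebra `sets`
    iff every level set {h >= c} belongs to it."""
    return all(frozenset(x for x in STATES if h[x] >= c) in sets
               for c in set(h))


def density_is_consistent(h, m, kernel):
    """Check the identity (h m) kernel = h m."""
    hm = [h[x] * m[x] for x in STATES]
    return push(hm, kernel) == hm


I_mu = invariant_sets(mu, P)
h_good = [Fr(2), Fr(2), Fr(0), Fr(0)]  # constant on each closed class
h_bad = [Fr(1), Fr(2), Fr(1), Fr(1)]   # separates states 0 and 1
```

In this toy case the invariant sets are the unions of the two closed classes, so `h_good` passes both checks and `h_bad` fails both, which is the equivalence (6.6) in miniature.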

Corollary 6.9 

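Let $\Pi$ be a non-empty set of probability kernels $\pi$ defined on $\mathcal{F}_\pi \times \Omega$, where $\mathcal{F}_\pi$ is a sub-$\sigma$-algebra of $\mathcal{F}$. Let us denote

$$\mathcal{G}(\Pi) = \big\{\mu \in \mathcal{P}(\Omega, \mathcal{F}) : \mu\pi = \mu \text{ on } \mathcal{F}_\pi \text{ for all } \pi \in \Pi\big\} \tag{6.10}$$

and, for each $\mu \in \mathcal{G}(\Pi)$, let

$$\mathcal{I}_\Pi(\mu) = \bigcap_{\pi \in \Pi} \mathcal{I}_\pi(\mu) \tag{6.11}$$

be the $\sigma$-algebra of all $\mu$-almost surely $\Pi$-invariant sets. Then $\mu$ is trivial on $\mathcal{I}_\Pi(\mu)$ if $\mu$ is extreme in $\mathcal{G}(\Pi)$.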
Proof

Suppose $\mu$ is not trivial on $\mathcal{I}_\Pi(\mu)$ and take $A \in \mathcal{I}_\Pi(\mu)$ such that $0 < \mu(A) < 1$. The measures

$$\nu = \mu(\,\cdot \mid A) = h\mu \quad\text{with}\quad h = \frac{\mathbb{1}_A}{\mu(A)}$$

and

$$\nu' = \mu(\,\cdot \mid A^c) = h'\mu \quad\text{with}\quad h' = \frac{\mathbb{1}_{A^c}}{\mu(A^c)}$$

satisfy $\nu \neq \nu'$ and $\mu = \mu(A)\,\nu + \mu(A^c)\,\nu'$. The functions $h$ and $h'$ are $\mathcal{I}_\pi(\mu)$-measurable for all $\pi \in \Pi$. Thus, (ii) of Proposition 6.4 implies that $\nu, \nu' \in \mathcal{G}(\Pi)$, a fact that contradicts the extremality of $\mu$. □

Lemma 6.12 

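Let $f$ be a LIS defined on $(\Omega, \mathcal{F})$ and $\mu \in \mathcal{G}(f)$. Let us denote by $\overline{\mathcal{F}}^{\mu}_{-\infty}$ the $\mu$-completion of $\mathcal{F}_{-\infty}$. Then

$$\bigcap_{n \ge 0} \mathcal{I}_{f_{[k-n,k]}}(\mu) = \overline{\mathcal{F}}^{\mu}_{-\infty} \tag{6.13}$$

for each $k \in \mathbb{Z}$, and

$$\bigcap_{\Lambda \in \mathcal{S}_b} \mathcal{I}_{f_\Lambda}(\mu) = \overline{\mathcal{F}}^{\mu}_{-\infty}. \tag{6.14}$$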

Proof 

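Identity (6.13) follows from the observation that for each $B \in \bigcap_{n} \mathcal{I}_{f_{[k-n,k]}}(\mu)$ the set $A = \bigcap_{n} \{f_{[k-n,k]}(B \mid \cdot\,) = 1\}$ satisfies $A = B$ $\mu$-a.s. and $A \in \mathcal{F}_{-\infty}$. Equality (6.14) is a consequence of (6.13) because

$$\bigcap_{\Lambda \in \mathcal{S}_b} \mathcal{I}_{f_\Lambda}(\mu) = \bigcap_{k \in \mathbb{Z}} \bigcap_{n \ge 0} \mathcal{I}_{f_{[k-n,k]}}(\mu). \qquad\Box$$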

Proof of Theorem 3.6

(a) It is immediate.

(b) ($\Rightarrow$) The implication follows readily from Corollary 6.9 and the fact that, by (6.14), $\bigcap_{\Lambda \in \mathcal{S}_b} \mathcal{I}_{f_\Lambda}(\mu)$ is $\mu$-trivial if and only if $\mu$ is trivial on $\mathcal{F}_{-\infty}$.

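(c) ($\Rightarrow$) Let $\mu, \nu \in \mathcal{G}(f)$ such that $\nu \ll \mu$. There exists an $\mathcal{F}$-measurable non-negative function $g$ such that

$$\nu = g\mu.$$

Let us consider, for each $k \in \mathbb{Z}$, $\mu_k = \mu|_{\mathcal{F}_{\le k}}$ and $\nu_k = \nu|_{\mathcal{F}_{\le k}}$. As, in particular, $\nu_k \ll \mu_k$ on $\mathcal{F}_{\le k}$, there exists $g_k \ge 0$, $\mathcal{F}_{\le k}$-measurable, satisfying $\nu_k = g_k \mu_k$ on $\mathcal{F}_{\le k}$. All we have to prove is that

$$g_k \text{ is } \overline{\mathcal{F}}^{\mu}_{-\infty}\text{-measurable} \quad \forall k \in \mathbb{Z}. \tag{6.15}$$

Indeed, by the reverse martingale theorem $g_k \to g$ $\mu$-a.s. Therefore, $g$ inherits the $\overline{\mathcal{F}}^{\mu}_{-\infty}$-measurability and, thus, it is $\mu$-a.s. equal to an $\mathcal{F}_{-\infty}$-measurable function.

To prove (6.15) we observe that, since $\nu \in \mathcal{G}(f)$,

$$(g_k \mu_k)\, f_{[k-n,k]} = g_k \mu_k$$

on $\mathcal{F}_{\le k}$ for all $n \in \mathbb{N}$. As $g_k$ is $\mathcal{F}_{\le k}$-measurable, we conclude from Proposition 6.4 that $g_k$ is $\bigcap_{n} \mathcal{I}_{f_{[k-n,k]}}(\mu)$-measurable. Its $\overline{\mathcal{F}}^{\mu}_{-\infty}$-measurability follows, hence, from (6.13).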

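(b) ($\Leftarrow$) Assume $\mu$ is a trivial measure on $\mathcal{F}_{-\infty}$ and suppose that there exist $s$ with $0 < s < 1$ and $\nu, \nu' \in \mathcal{G}(f)$ such that $\mu = s\,\nu + (1-s)\,\nu'$. As $\nu, \nu' \ll \mu$, by (c) ($\Rightarrow$) there exist $\overline{\mathcal{F}}^{\mu}_{-\infty}$-measurable functions $h, h' \ge 0$ such that $\nu = h\mu$ and $\nu' = h'\mu$. But the triviality of $\mu$ on $\mathcal{F}_{-\infty}$ implies that $h = h' = 1$ $\mu$-a.s. Thus $\mu = \nu = \nu'$.

(c) ($\Leftarrow$) This is an immediate consequence of Proposition 6.4 plus the fact that $h$ is $\mathcal{I}_{f_\Lambda}(\mu)$-measurable for all $\Lambda \in \mathcal{S}_b$.

(d) Let $\mu, \nu \in \mathcal{G}(f)$ such that $\mu = \nu$ on $\mathcal{F}_{-\infty}$. Consider $\bar\mu = \tfrac{1}{2}\mu + \tfrac{1}{2}\nu \in \mathcal{G}(f)$. Since $\mu \ll \bar\mu$ and $\nu \ll \bar\mu$, assertion (c) implies that $\mu = f\bar\mu$ and $\nu = g\bar\mu$ for $\mathcal{F}_{-\infty}$-measurable functions $f$ and $g$. But $\mu = \nu = \bar\mu$ on $\mathcal{F}_{-\infty}$, so $f = g$ $\bar\mu$-a.s. and therefore $\mu = \nu$.

(e) It is an immediate consequence of (b) and (d). □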

6.3 Triviality and short-range correlations 

The proofs involve standard arguments. We include them for completeness. 
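Proof of Proposition 3.7

(a) $\Rightarrow$ (c) Let $A \in \mathcal{F}$ and $k \in \mathbb{Z}$. Since $\mathcal{F}_{-\infty} = \bigcap_{n \ge 1} \mathcal{F}_{\le k-n}$, the reverse martingale theorem yields

$$\mu(A \mid \mathcal{F}_{\le k-n}) \xrightarrow[n \to +\infty]{L^1} \mu(A \mid \mathcal{F}_{-\infty}). \tag{6.16}$$

The assumed triviality of $\mu$ on $\mathcal{F}_{-\infty}$ implies that $\mu(A \mid \mathcal{F}_{-\infty}) = \mu(A)$ $\mu$-a.s. We deduce that for each $\varepsilon > 0$ there exists $\Delta \in \mathcal{S}_b$ such that

$$\mu\big(\,|\mu(A \mid \mathcal{F}_{\Delta_-}) - \mu(A)|\,\big) < \varepsilon. \tag{6.17}$$

Hence, for all $\Lambda \in \mathcal{S}_b$ with $\Lambda \supset \Delta$,

$$\sup_{B \in \mathcal{F}_{\Lambda_-}} \big|\mu(A \cap B) - \mu(A)\,\mu(B)\big| \;\le\; \sup_{B \in \mathcal{F}_{\Delta_-}} \big|\mu(A \cap B) - \mu(A)\,\mu(B)\big| \;=\; \sup_{B \in \mathcal{F}_{\Delta_-}} \Big|\mu\Big(\big[\mu(A \mid \mathcal{F}_{\Delta_-}) - \mu(A)\big]\,\mathbb{1}_B\Big)\Big| \;\le\; \mu\big(\,|\mu(A \mid \mathcal{F}_{\Delta_-}) - \mu(A)|\,\big) \;<\; \varepsilon.$$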



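(b) $\Rightarrow$ (a) Fix $B \in \mathcal{F}_{-\infty}$ and consider $\mathcal{D} = \{A \in \mathcal{F} : \mu(A \cap B) = \mu(A)\,\mu(B)\}$. It is straightforward to see that $\mathcal{D}$ is a $\lambda$-system. By assumption $\mathcal{D}$ contains all cylinder events, so $\mathcal{D} = \mathcal{F}$ [Dynkin's $\pi$-$\lambda$ theorem]. In particular $B \in \mathcal{D}$, thus $\mu(B) = (\mu(B))^2$ and thereby $\mu(B) = 0$ or $1$. □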
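For a concrete feel of the short-range correlation property, the supremum over past events can be computed exactly for a stationary two-state Markov chain. This is only an illustrative sketch under assumptions of our own: the transition matrix `P` is hypothetical, the event is $A = \{X_0 = a\}$, and the Markov property is used to reduce $\mu(A \mid \text{past up to time } -n)$ to the $n$-step transition probability.

```python
from fractions import Fraction as Fr

# Hypothetical two-state Markov chain (not taken from the paper).
P = [[Fr(9, 10), Fr(1, 10)],
     [Fr(1, 5), Fr(4, 5)]]
pi = [Fr(2, 3), Fr(1, 3)]  # stationary law: pi P = pi


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def n_step(n):
    """n-step transition matrix P^n."""
    M = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
    for _ in range(n):
        M = matmul(M, P)
    return M


def past_correlation(a, n):
    """sup over events B depending on X_{-n}, X_{-n-1}, ... of
    |mu(A and B) - mu(A) mu(B)| for A = {X_0 = a}.

    By the Markov property mu(A | past up to -n) = P^n(X_{-n}, a), so the
    supremum is attained on B = {X_{-n} in S}, where S collects the states
    b with P^n(b, a) > pi[a].
    """
    Pn = n_step(n)
    return sum(pi[b] * max(Pn[b][a] - pi[a], Fr(0)) for b in range(2))
```

For this chain the second eigenvalue is $7/10$, and `past_correlation(0, n)` equals $\tfrac{2}{9}(7/10)^n$, so the correlation with the remote past vanishes geometrically, as expected for a chain that is trivial on the tail $\sigma$-algebra.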

Proof of Theorem 3.8

(a) Let $h$ be a bounded local function on $\Omega$. As $\mu$ is consistent with $f$, $f_{\Lambda_n} h$ coincides with $\mu(h \mid \mathcal{F}_{(\Lambda_n)_-})$, $\mu$-a.s., for $n$ sufficiently large. Therefore, by the reverse martingale convergence theorem we conclude that

$$f_{\Lambda_n} h \xrightarrow[n \to \infty]{} \mu(h \mid \mathcal{F}_{-\infty}) \quad \mu\text{-a.s.}$$

This implies assertion (a) because $\mu$ is trivial on $\mathcal{F}_{-\infty}$.

(b) It is a consequence of assertion (a) and the fact that if $\Omega$ is compact and metric, the space of local continuous functions on $\Omega$ contains a countable subset which is dense with respect to the uniform norm. □



6.4 Ergodicity 

We need a well known result of ergodic theory. See, for instance, Georgii 
(1988), Theorem 14.5, for a proof. 

Theorem 6.18 

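(a) A probability measure $\mu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ is extreme in $\mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ if and only if $\mu$ is ergodic.

(b) Let $\mu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ and $\nu \in \mathcal{P}(\Omega, \mathcal{F})$ such that $\nu \ll \mu$. Then $\nu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$ if and only if $\exists\, h \ge 0$, $\mathcal{I}$-measurable: $\nu = h\mu$.

Lemma 6.19

Let $\mu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$, then $\mathcal{I} \subset \mathcal{F}_{-\infty}$ $\mu$-a.s. More precisely, for each $A \in \mathcal{I}$ there exists $B \in \mathcal{F}_{-\infty}$ such that $\mu(A \triangle B) = 0$.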
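Proof

Let $A \in \mathcal{I}$ and $(B_n)_{n \ge 1}$ be a sequence of cylinder sets such that $\mu(A \triangle B_n) < 2^{-n}$ for all $n \ge 1$. Since $\mu \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$, we have that

$$\mu(A \triangle \tau^i B_n) = \mu(\tau^i A \triangle \tau^i B_n) = \mu(A \triangle B_n) < 2^{-n}$$

for each $i \in \mathbb{N}$ ($\tau^i$ is the $i$th iterate of $\tau$). Consider $\Lambda_n \uparrow \mathbb{Z}$ such that $B_n \in \mathcal{F}_{\Lambda_n}$. For each $n \ge 1$ we choose $i(n) \ge 0$ such that $\Lambda_n \cap (\Lambda_n - i(n)) = \emptyset$. Each set $C_n = \tau^{i(n)} B_n$ belongs to $\mathcal{F}_{(\Lambda_n)_-}$ and satisfies $\mu(A \triangle C_n) < 2^{-n}$. Therefore, the set $C = \bigcap_{m \ge 1} \bigcup_{n \ge m} C_n$ belongs to $\mathcal{F}_{-\infty}$ and satisfies

$$\mu(A \triangle C) \le \lim_{m \to \infty} \sum_{n \ge m} \mu(A \triangle C_n) = 0. \qquad\Box$$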




Proof of Theorem 3.9

(a) Let us consider the probability kernel $T$ on $\mathcal{F} \times \Omega$ defined by

$$T(A \mid \omega) = \mathbb{1}_A(\tau\omega)$$

for every $A \in \mathcal{F}$ and every $\omega \in \Omega$.
To prove necessity we introduce

$$\mathcal{K}(\mu) = \Big(\bigcap_{\Lambda \in \mathcal{S}_b} \mathcal{I}_{f_\Lambda}(\mu)\Big) \cap \mathcal{I}_T(\mu).$$

By (6.14) and Lemma 6.19, $\mathcal{K}(\mu)$ is the $\mu$-completion of $\mathcal{I}$. Therefore Corollary 6.9 implies that each $\mu$ extreme in $\mathcal{G}_{\rm inv}(f)$ is trivial on $\mathcal{I}$.

For the sufficiency, suppose that $\mu$ is trivial on $\mathcal{I}$ and consider a decomposition $\mu = s\,\nu + (1-s)\,\nu'$ with $0 < s < 1$ and $\nu, \nu' \in \mathcal{G}_{\rm inv}(f)$. Then there exist $\mathcal{F}$-measurable $h, h' \ge 0$ such that $\nu = h\mu$ and $\nu' = h'\mu$. Since $\mu, \nu, \nu' \in \mathcal{P}_{\rm inv}(\Omega, \mathcal{F})$, Proposition 6.4 applied to $\mathcal{I}_T(\mu)$ implies that $h, h'$ are measurable with respect to the $\mu$-completion of $\mathcal{I}$. Hence the triviality of $\mu$ on $\mathcal{I}$ ensures that $h = h' = 1$ $\mu$-a.s. Thus $\mu = \nu = \nu'$.

(b) Theorem 6.18 (b) implies that there exists $h \ge 0$, $\mathcal{I}$-measurable, such that $\nu = h\mu$. By Lemma 6.19 $h$ is $\mathcal{F}_{-\infty}$-measurable, so Theorem 3.6 (c) implies that $\nu \in \mathcal{G}(f)$. Therefore $\nu \in \mathcal{G}_{\rm inv}(f)$.

(c) It is an immediate consequence of (b). □
7 Proofs on uniqueness

7.1 One-sided boundary-uniformity 
Lemma 7.1 

If uniqueness condition (4.2) is satisfied, then $\nu \ge c\,\mu$, $\forall \mu, \nu \in \mathcal{G}(f)$.
Proof

Let $A$ be a cylinder set and $n$ an integer such that (4.2) holds. If $\mu$ and $\nu$ are consistent with $f$,

$$\nu(A) = \iint f_{[-n,m]}(A \mid \xi)\,\mu(d\eta)\,\nu(d\xi) \;\ge\; c \iint f_{[-n,m]}(A \mid \eta)\,\mu(d\eta)\,\nu(d\xi) \;=\; c\,\mu(A). \qquad\Box$$



Proof of Theorem 4.1

We shall prove that every element of $\mathcal{G}(f)$ is extreme. Let $\mu \in \mathcal{G}(f)$ and $B \in \mathcal{F}_{-\infty}$ such that $\mu(B) > 0$. Define

$$\nu = \mu(\,\cdot \mid B) = \frac{\mathbb{1}_B}{\mu(B)}\,\mu.$$

By Theorem 3.6 (c), $\nu \in \mathcal{G}(f)$. By the preceding lemma $0 = \nu(B^c) \ge c\,\mu(B^c)$, so $\mu(B) = 1$. □

Proof of Proposition 4.4

Call $m(f)$ the infimum (4.5) and $V(f)$ the supremum (4.6). Through an elementary logarithmic inequality we have that for each $i, j \in \mathbb{Z}$ with $i > j$ and each $\xi, \eta \in \Omega_{\le i}$ with $\xi \overset{\neq j}{=} \eta$,

> exp f_^fe)) . (7 . 2) 



Applying the factorization (3.4) we conclude that for each $n, m \in \mathbb{Z}$ with $n < m$ and each $\xi, \eta \in \Omega_{\le m}$ with $\xi_n^m = \eta_n^m$

> e -V(f)/m(f) ^ 



/n ) 



/[n,m] (Cn 


1 


f[n,m] yfn 
7.2 Dobrushin uniqueness

The following bound is the basic tool of the theory. 



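Lemma 7.4 (Multisite dusting lemma)

Let $V \in \mathcal{S}_b$, $f_V$ a probability kernel on $\mathcal{F}_{\le m_V} \times \Omega$ and $\alpha^V$ a $d$-sensitivity estimator for $f_V$. Then,

$$\delta_j^d(f_V h) \;\begin{cases} = 0 & \text{if } j \in V \\[2pt] \le \delta_j^d(h) + \displaystyle\sum_{k \in V} \delta_k^d(h)\,\alpha^V_{kj} & \text{if } j \in V_- , \end{cases} \tag{7.5}$$

for every continuous function $h$ on $V \cup V_-$.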



Remark 7.6 

The name of the lemma comes from a picturesque interpretation due to Michael Aizenman reported in Simon (1993): If the oscillations are interpreted as "dust" and the averages $f_V$ as applications of a (multisite) "duster", the lemma says that no dust remains in $V$ after dusting the sites there [first line of (7.5)], but the dust has been spread over the remaining sites [second line of (7.5)]. The estimators give the fraction blown from site to site. In this picture, Dobrushin condition (4.15) means that some dust stays in the duster, a fact that allows for an eventual total cleaning.

Proof 

The first line in (7.5) just expresses the fact that the average $f_V h$ is $\mathcal{F}_{V_-}$-measurable. The second line shows two contributions: the first one due to the direct dependence of $h$ on the configuration at the site $j$, and the second to the sensitivity of the $f_V$-averages to the configuration at the past instant $j$. To separate both contributions we introduce a family of auxiliary functions $h_{V,\omega}(\sigma_V) = h(\omega_{V_-}\sigma_V)$ for each $\omega \in \Omega$ ("freezing" at $\omega$). For $j \in V_-$ and $\xi, \eta \in \Omega_{V_-}$ such that $\xi \overset{\neq j}{=} \eta$, we have

$$\big|f_V(h \mid \xi) - f_V(h \mid \eta)\big| \;\le\; \big|\mathring{f}_V(h_{V,\xi} - h_{V,\eta} \mid \xi)\big| + \big|\mathring{f}_V(h_{V,\eta} \mid \xi) - \mathring{f}_V(h_{V,\eta} \mid \eta)\big|. \tag{7.7}$$

If we divide throughout by $d(\xi_j, \eta_j)$ and use the estimator bound (4.13) we obtain, upon taking the necessary suprema, the second line in (7.5). □
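The second line of (7.5) can be tested numerically in the simplest setting. The sketch below rests on assumptions of our own, none of which come from the paper: binary spins with the discrete metric (so the distance between single-site laws is total variation), a single-site block $V = \{0\}$, a hypothetical kernel `p1` with memory two, and the estimator taken to be the exact total-variation oscillation of the kernel. Tuples `(x_{-2}, x_{-1}, x_0)` encode configurations, so tuple position 2 is site $0$.

```python
from fractions import Fraction as Fr
from itertools import product
import random

SPINS = (0, 1)

# Hypothetical single-site kernel at site 0 with memory two:
# p1[(x_{-2}, x_{-1})] = probability that x_0 = 1 given the two previous spins.
p1 = {(0, 0): Fr(1, 5), (0, 1): Fr(1, 2), (1, 0): Fr(3, 10), (1, 1): Fr(7, 10)}


def f0(h, past):
    """Average of h over x_0 given the past (x_{-2}, x_{-1})."""
    q = p1[past]
    return (1 - q) * h[past + (0,)] + q * h[past + (1,)]


def flip(cfg, j):
    """Configuration cfg with the spin at tuple position j flipped."""
    return cfg[:j] + (1 - cfg[j],) + cfg[j + 1:]


def osc(g, j, arity):
    """Oscillation of g at coordinate j (discrete metric)."""
    return max(abs(g[c] - g[flip(c, j)]) for c in product(SPINS, repeat=arity))


def alpha(j):
    """Total-variation sensitivity of the kernel to the past coordinate j."""
    return max(abs(p1[c] - p1[flip(c, j)]) for c in product(SPINS, repeat=2))


def dusting_holds(h):
    """Check osc_j(f0 h) <= osc_j(h) + osc_0(h) * alpha_j for both past sites."""
    avg = {past: f0(h, past) for past in product(SPINS, repeat=2)}
    return all(osc(avg, j, 2) <= osc(h, j, 3) + osc(h, 2, 3) * alpha(j)
               for j in (0, 1))


# Random integer-valued test functions of (x_{-2}, x_{-1}, x_0).
rng = random.Random(0)
trials = [{c: Fr(rng.randint(-10, 10)) for c in product(SPINS, repeat=3)}
          for _ in range(200)]
all_ok = all(dusting_holds(h) for h in trials)

# h depending only on x_0: all of its "dust" sits at site 0.
h_x0 = {c: Fr(c[2]) for c in product(SPINS, repeat=3)}
```

For `h_x0` the bound is attained: the oscillation of the averaged function at a past site $j$ equals `alpha(j)`, which matches the picture of dust being blown from site $0$ to site $j$ in the proportion given by the estimator.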

We now fix a partition $\mathcal{V}$ of $\mathbb{Z}$ into finite intervals and denote, for each $\Lambda \subset \mathbb{Z}$,

$$\Lambda^* = \bigcup \{V \in \mathcal{V} : \Lambda \cap V \neq \emptyset\}.$$

Let $n(\Lambda)$ denote the number of elements of $\mathcal{V}$ forming $\Lambda^*$.
Proposition 7.8 

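Consider a LIS $f$ and $d$-sensitivity estimators $\alpha^V$ for $f_V$ for each $V \in \mathcal{V}$.
(i) For every $j \in \Lambda^*_-$ and $h \in B_d(\Lambda^* \cup \Lambda^*_-)$,

$$\delta_j^d(f_{\Lambda^*} h) \;\le\; \delta_j^d(h) + \sum_{k \in \Lambda^*} \delta_k^d(h) \Big[\sum_{l=1}^{n(\Lambda)} (P_{\Lambda^*}\alpha)^l\Big]_{kj}. \tag{7.9}$$

(ii) If Dobrushin condition (4.15) is satisfied, then for every $j \in \Lambda^*_-$ and $h \in B_d(\Lambda^*)$,

$$\delta_j^d(f_{\Lambda^*} h) \;\le\; \sum_{k \in \Lambda^*} \delta_k^d(h) \Big[\frac{P_{\Lambda^*}\alpha}{1 - P_{\Lambda^*}\alpha}\Big]_{kj}. \tag{7.10}$$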

Proof 

We only need to prove (7.9). Inequality (7.10) is then obtained by bounding the sum in the RHS of (7.9) by the limit $n(\Lambda) \to \infty$, which is finite under Dobrushin condition.
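The passage from a finite sum of matrix powers to its limit is the usual Neumann-series argument. As a minimal sketch, assuming that Dobrushin condition makes the relevant non-negative matrix a strict contraction (the condition (4.15) itself is not reproduced here), the hypothetical $2 \times 2$ matrix `A` below stands in for $P_{\Lambda^*}\alpha$; the code checks that every partial sum $\sum_{l=1}^{n} A^l$ is dominated entrywise by $A(1-A)^{-1}$.

```python
from fractions import Fraction as Fr

# Hypothetical non-negative matrix with row sums below 1, standing in for
# the truncated estimator matrix P_{Lambda*} alpha.
A = [[Fr(1, 4), Fr(1, 4)],
     [Fr(1, 8), Fr(1, 2)]]
I2 = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]


def matmul(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matadd(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def partial_sum(n):
    """Return A + A^2 + ... + A^n."""
    total = [[Fr(0), Fr(0)], [Fr(0), Fr(0)]]
    power = I2
    for _ in range(n):
        power = matmul(power, A)
        total = matadd(total, power)
    return total


# Closed form of the full series: A (I - A)^{-1}.
I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]
limit = matmul(A, inverse_2x2(I_minus_A))


def dominated(X, Y):
    """Entrywise comparison X <= Y."""
    return all(X[i][j] <= Y[i][j] for i in range(2) for j in range(2))
```

Because all entries are non-negative, the partial sums increase with $n$, so replacing the finite sum in (7.9) by its limit can only enlarge the bound; this is the step that produces the right-hand side of (7.10).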

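We proceed by induction on $n(\Lambda)$. The case $n(\Lambda) = 1$ is just the multisite dusting lemma. Suppose the inequality valid for all $\Lambda$ with $n(\Lambda) = n$. Consider $\Lambda$ such that $\Lambda^* = \bigcup_{i=1}^{n+1} V_i$, where the $V_i \in \mathcal{V}$, $i = 1, \ldots, n+1$, are labeled so that $m_{V_i} = l_{V_{i+1}} - 1$. Denote $\tilde\Lambda^* = \bigcup_{i=1}^{n} V_i$. Let $j \in \Lambda^*_-$ and $h \in B_d(\Lambda^* \cup \Lambda^*_-)$. By the factorization property (3.5) of the LIS, $\delta_j^d(f_{\Lambda^*} h) = \delta_j^d(f_{\tilde\Lambda^*} f_{V_{n+1}} h)$. Therefore, by the inductive hypothesis,

$$\delta_j^d(f_{\Lambda^*} h) \;\le\; \delta_j^d(f_{V_{n+1}} h) + \sum_{k \in \tilde\Lambda^*} \delta_k^d(f_{V_{n+1}} h) \Big[\sum_{l=1}^{n} (P_{\tilde\Lambda^*}\alpha)^l\Big]_{kj}$$

and the multisite dusting lemma 7.4 yields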



fceA* 



Z=l 



We now observe that, given the restrictions in the sites being summed over, we can replace in the RHS $P_{\tilde\Lambda^*}$ and $P_{V_{n+1}}$ by $P_{\Lambda^*}$. Furthermore, for $m \in V_{n+1}$, $l \in \mathbb{N}$,



i=i fceVi 



a 



mfc 



(Pa* a) 



(P A * a 



,<+i 



J mj 






The last two displays imply that 



fceA* 



'n+1 



2=1 



□ 



Proof of Theorem 4.14

Let us label the elements of the partition so that $\mathcal{V} = \{V_i : i \in \mathbb{Z}\}$ and $m_{V_i} = l_{V_{i+1}} - 1$, $i \in \mathbb{Z}$. Let us denote $V^n_{m-i} = \bigcup_{j=m-i}^{n} V_j$ for every integer $n, m, i$ with $m - i \le n$. Let $\mu, \nu \in \mathcal{G}(f)$ and consider a local function $h$ of $d$-bounded variations. Pick $m, n \in \mathbb{Z}$ such that $h \in B_d(V^n_m)$. The consistency of both $\mu$ and $\nu$ with $f_{V^n_{m-i}}$, for an integer $i \ge 0$, implies



z/(ft) — n(h) 



< 



Therefore, by the continuity of $f$ and (7.10),

P A a 



je(v m -i)_ 



fceA je(V m -i)_ 



l-P A a 



Under condition (4.15) the series on the RHS is summable, hence the bound converges to zero as $i \to \infty$. □



8 Proofs on loss of memory and mixing 

Proof of Theorem 5.3 and Proposition 5.24

Part (i) of Theorem 5.3 is just (7.9). The triangular property of F implies that for each $i \in \Lambda^*$,

ieA* ieA* 

Therefore, 

jez 

Proceeding inductively we obtain 

[(^A*«)"] fcj < 7a* e"^ (8.1) 
for every natural $n$. This yields (5.25) upon summation over $n$. Combining
(Iq~231) with ff77TU|l . we obtain (pTKj) . □ 



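Proof of Theorem 5.15

Fix $\Lambda \in \mathcal{S}_b$ and $h \in B_d(\Lambda)$. Using the consistency of $\mu$ and $\tilde\mu$ respectively with $f$ and $\tilde f$, we have that, for each $n \in \mathbb{N}$,

$$|\mu(h) - \tilde\mu(h)| \;\le\; \big|\mu(f_{[m_\Lambda-n,m_\Lambda]}h) - \tilde\mu(f_{[m_\Lambda-n,m_\Lambda]}h)\big| + \big|\tilde\mu(f_{[m_\Lambda-n,m_\Lambda]}h) - \tilde\mu(\tilde f_{[m_\Lambda-n,m_\Lambda]}h)\big|. \tag{8.2}$$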



+ 



We estimate separately each term on the right as $n$ tends to infinity. The compactness of $\Omega$ implies that $f_{[m_\Lambda-n,m_\Lambda]}(h \mid \omega) \to \mu(h)$ for each $\omega \in \Omega$ as $n \to \infty$ (see Remark 4.8). Therefore, by dominated convergence ($h$ is continuous, hence bounded),

$$\big|\mu(h) - \tilde\mu(f_{[m_\Lambda-n,m_\Lambda]}h)\big| \xrightarrow[n \to \infty]{} 0. \tag{8.3}$$



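To bound the last term in (8.2) we telescope using the factorization property (3.5) for LIS:

$$\big|\tilde\mu(f_{[m_\Lambda-n,m_\Lambda]}h) - \tilde\mu(\tilde f_{[m_\Lambda-n,m_\Lambda]}h)\big| \;\le\; \big|\tilde\mu(f_{\{m_\Lambda\}}h) - \tilde\mu(\tilde f_{\{m_\Lambda\}}h)\big| + \sum_{k=m_\Lambda-n}^{m_\Lambda-1} \big|\tilde\mu(f_{[k,m_\Lambda]}h) - \tilde\mu(\tilde f_{\{k\}} f_{[k+1,m_\Lambda]}h)\big|. \tag{8.4}$$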



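The definition (4.16)/(4.17) of the VKR distance implies that

$$\big|(f_{\{k\}}g)(\omega) - (\tilde f_{\{k\}}g)(\omega)\big| \;\le\; \delta_k^d(g)\,\big\|\mathring{f}_{\{k\}}(\,\cdot \mid \omega) - \mathring{\tilde f}_{\{k\}}(\,\cdot \mid \omega)\big\|_d$$

for all $k \in \mathbb{Z}$, $\omega \in \Omega_{-\infty}^{k-1}$ and $g \in B_d(]-\infty, k])$. Hypothesis (5.16) implies

$$\big|\tilde\mu(f_{\{k\}}g - \tilde f_{\{k\}}g)\big| \;\le\; \tilde\mu(b_k)\,\delta_k^d(g). \tag{8.5}$$

Combining (8.4) and (8.5) we obtain

$$\big|\tilde\mu(f_{[m_\Lambda-n,m_\Lambda]}h) - \tilde\mu(\tilde f_{[m_\Lambda-n,m_\Lambda]}h)\big| \;\le\; \sum_{k=m_\Lambda-n}^{m_\Lambda-1} \tilde\mu(b_k)\,\delta_k^d(f_{[k+1,m_\Lambda]}h) + \tilde\mu(b_{m_\Lambda})\,\delta_{m_\Lambda}^d(h). \tag{8.6}$$

To obtain (5.17) we insert this bound in (8.2), let $n$ tend to infinity and use (8.3). □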



Proof of Theorem 5.18

Fix $\Lambda, \Delta \in \mathcal{S}_b$ with $m_\Delta < l_\Lambda$, $h_1 \in B_d(\Lambda)$ and $h_2 \in B_d(\Delta)$. Without loss, we can suppose that $h_2 \ge 0$, $h_2 \neq 0$ and $\mu(h_2) = 1$, since both sides of (5.19) are invariant under adding a constant to $h_2$ and both multiply in the same way if $h_2$ is multiplied by a positive constant. We can then write

$$\mathrm{Cor}_\mu(h_1, h_2) = |\nu(h_1) - \mu(h_1)| \tag{8.7}$$

where $\nu$ is the probability measure defined by

$$\nu = h_2\,\mu. \tag{8.8}$$

1st stage: We construct a LIS $\tilde f$ for $\nu$ on $]-\infty, m_\Lambda]$. For every $k \in\, ]-\infty, m_\Lambda]$, let us define

$$\tilde f_k = g_k\, f_{\{k\}} \tag{8.9}$$

with

$$g_k = \begin{cases} 1 & \text{if } k \in [m_\Delta + 1, m_\Lambda] \\[4pt] \dfrac{f_{[k+1,m_\Delta]}(h_2 \mid \cdot\,)}{f_{[k,m_\Delta]}(h_2 \mid \cdot\,)} & \text{if } k \in\, ]-\infty, m_\Delta]. \end{cases} \tag{8.10}$$

The function $g_k$ is well defined because $f_{[k,m_\Delta]}h_2 \neq 0$ for every $k \in\, ]-\infty, m_\Delta]$. Indeed, the existence of $k$ such that $f_{[k,m_\Delta]}h_2 = 0$ would imply, by consistency, that $\mu(h_2) = 0$. This contradicts the fact that $\mu(h_2) = 1$. It is clear that the kernels $\tilde f_k$ satisfy the hypotheses of Theorem 3.1, hence they uniquely define a LIS $\tilde f$ on $]-\infty, m_\Lambda]$. The same theorem shows that the consistency of $\nu$ with each $\tilde f_k$, $k \in\, ]-\infty, m_\Lambda]$, is all that has to be checked in order to prove that $\nu$ is consistent with $\tilde f$.

If $k \in [m_\Delta + 1, m_\Lambda]$, this consistency is a consequence of the following sequence of identities, valid for every $\mathcal{F}_{\le k}$-measurable $h$:

$$\nu(\tilde f_k(h)) = \mu(h_2\, f_{\{k\}}(h)) = \mu(f_{\{k\}}(h_2 h)) = \mu(h_2 h) = \nu(h). \tag{8.11}$$

The third identity is due to the $\mathcal{F}_{\le k-1}$-measurability of $h_2$ and the fourth one to consistency.

For $k \in\, ]-\infty, m_\Delta]$ we observe that for every $\mathcal{F}_{\le k}$-measurable $h$,

$$\nu(\tilde f_k(h)) = \mu\big(h_2\, f_{\{k\}}(g_k h)\big) = \mu\Big(f_{[k,m_\Delta]}\big[h_2\, f_{\{k\}}(g_k h)\big]\Big),$$

the last identity being a consequence of the consistency of $\mu$ with $f$. Upon inserting the definition of $g_k$ [second line in (8.10)] we see that there is a term $f_{[k,m_\Delta]}(h_2)$ in the denominator that can be pulled to the left because of its $\mathcal{F}_{\le k-1}$-measurability. This produces a cancellation with an analogous term in the numerator. We thus obtain

$$\nu(\tilde f_k(h)) = \mu\Big(f_{\{k\}}\big[h\, f_{[k+1,m_\Delta]}(h_2)\big]\Big) = \mu\Big(f_{\{k\}}\big[f_{[k+1,m_\Delta]}(h_2 h)\big]\Big) = \mu(h_2 h) = \nu(h). \tag{8.12}$$

The third identity is due to the $\mathcal{F}_{\le k}$-measurability of $h$ and the fourth one to the consistency of $\mu$ with $f$. Identities (8.11) and (8.12) prove that $\nu$ is consistent with $\tilde f$ on $]-\infty, m_\Lambda]$.

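2nd stage: For every $k \in \Lambda \cup \Lambda_-$ and $\omega \in \Omega_{-\infty}^{k-1}$ we construct $b_k(\omega)$ such that

$$\big\|\mathring{\tilde f}_k(\,\cdot \mid \omega) - \mathring{f}_k(\,\cdot \mid \omega)\big\|_d \;\le\; b_k(\omega). \tag{8.13}$$

For starters, we can take

$$b_k = 0 \quad \forall k \in [m_\Delta + 1, m_\Lambda], \tag{8.14}$$

because $\tilde f_k(\,\cdot \mid \omega) = f_k(\,\cdot \mid \omega)$ for $k \in [m_\Delta + 1, m_\Lambda]$ and $\omega \in \Omega_{-\infty}^{k-1}$.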

We fix k £ A U A_ and lu £ and consider the set fi£ = {u k £ 

fl{k} '■ uJ-oo ^-oo) "with the restricted topology and Borel cx-algebra. To 
abbreviate the notation we introduce the function u : f2£ — » K defined by 



and the measure 



/[fc+l,m A ] (^2 




/[fc,m A ] (^2 





« = /k(-|w) (8.16) 



on Notice that 



/*(•!<*>)-/*(• | u;) = ««-«■ (8-17) 
We also denote, for each ^-j^-measurable function /i, 

a h(x) + h(y) 

™h = sup — — r— , 

x&/ 2d(x,y) 



and observe that 



\h-mkDW,, < §5 d k (h) (8.1S 
We affirm that



.//,. <-M- ,/;,(• M < £ a (|«-l|). 

d L 



(8.19) 



Indeed, for h G Bd({k}) with 5%(h) < 1 we have 

a(h) — a(h) = a\(u — l)(/i — m^D)] < a(\u — 1|) \\h — rrihD\ 



u i 



From this and (|8.18jl . assertion ()8.19|) follows. 
We now use Schwarz's inequality to bound

$$\alpha(|u - 1|) = \alpha(|u - \alpha(u)|) \le \sqrt{\alpha\big((u - \alpha(u))^2\big)}$$

and, since $\alpha(u)$ minimizes $x \mapsto \alpha\big((u - x)^2\big)$, we obtain



a(|«-l|) 



< \\u-muDWoo 



a y[u — m u D 

The combination of (j8~TTI|) and (|8~2"UJ) gives (j8~T^|) with 

D 2 5f( M ) D 2 51 {f [k+1 , mA] h 2 ) 



b k (u) 



4 f[k,m A ]h 2 (uj) 



(8.20) 



(8.211 



3rd stage: We estimate $\nu(b_k)$. From (8.21):

By consistency, $\mu = \mu f_{[k,m_\Delta]}$, hence the last factor is just 1. From this and (8.14) we conclude that



v{b k 



if k G [m,A + 1, m\] 

D 2 

-j- St (f [k+ i,m A )h2) if A; G A U A_ . 



(8.22) 



In view of (8.7), inequalities (8.13) and (8.22) imply (5.19) by Theorem 5.15. □



Proof of Corollary 5.21

Part (i) follows from (TTTUJ) and (lo~TTIJ) . and part (ii) from (J7H) and (lo~2"2T) . 

□ 